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Abstract 

Understanding design principles of molecular interaction networks 
is an important goal of molecular systems biology. Some insights have 
been gained into features of their network topology through the dis- 
covery of graph theoretic patterns that constrain network dynamics. 
This paper contributes to the identification of patterns in the mech- 
anisms that govern network dynamics. The control of nodes in gene 
regulatory, signaling, and metabolic networks is governed by a variety 
of biochemical mechanisms, with inputs from other network nodes that 
act additively or synergistically. This paper focuses on a certain type 
of logical rule that appears frequently as a regulatory pattern. Within 
the context of the multistate discrete model paradigm, a rule type is 
introduced that reduces to the concept of nested canalyzing function in 
the Boolean network case. It is shown that networks that employ this 
type of multivalued logic exhibit more robust dynamics than random 
networks, with few attractors and short limit cycles. It is also shown 
that the majority of regulatory functions in many published models of 
gene regulatory and signaling networks are nested canalyzing. 

Key words: gene regulation; signaling; mathematical model; nested 
canalyzing function; robustness. 

1 Introduction 

Elucidating the large-scale graph structure of complex molecular inter- 
action networks, from transcriptional networks [2] to protein- protein 
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interaction networks [31^ and metabolic network is an important 
step toward an understanding of design principles of the cellular ar- 
chitecture. For instance, it has been shown that certain graph theo- 
retic network patterns are much more prevalent in such networks than 
could be expected. See, e.g., HO]. The next step is to understand 
cells as complex nonlinear dynamical systems. There now exist many 
dynamic models of gene regulatory, signaling, and metabolic pathways 
that provide snapshots of cellular dynamics using a range of modeling 
platforms. Many of these models represent the interactions of different 
molecular species as logical rules of some type that describe the combi- 
natorics of how the species combine to regulate others; see, e.g., pTlll8) . 
The logical rules of Boolean network models are an example of such a 
description, in which network states are reduced to binary states, with 
a species either present or absent. It was shown in [TSKTB] that a spe- 
cial type of Boolean logical rule which appears frequently in published 
Boolean network models [TT] exhibits robustness properties character- 
istic of molecular networks. These rules, so-called nested canalyzing 
functions, capture the spirit of Waddington's concept of canalyzation 
in gene regulation [3D] . Several other classes of Boolean functions have 
also been investigated in the search for biologically meaningful rules 
to describe molecular interactions, including random functions |14| . hi- 
erarchical canalyzing function j^Hl [12 ; chain functions [5] , and unate 
functions [lO] . 

In many cases the regulatory relationships are too complicated to 
be captured with Boolean logic, and more general models have been 
developed to represent these. Common other discrete model types, in 
addition to Boolean networks, are so-called logical models [28j, Petri 
nets [25], and agent-based models ^3\. In [29] and [12] it was shown 
that many of these models can be translated into the rich and gen- 
eral mathematical framework of polynomial dynamical systems over a 
finite field F. (Software to carry out this translation is available at 
http://dvd.vbi.vt.edu/cgi-bin/git/adain.pl. Since the algebraic 
structure of F is not relevant for our purposes, we will consider a slightly 
more general setup. Let xi, . . . ,Xn be variables, which can take values 
in finite sets Xi, . . . , X„, respectively. Let X = Xi x • • • x X„ be the 
Cartesian product. A dynamical system in the variables xi, . . . , x„ is 
a function 

f=ih,...,fn):X^X 

where each coordinate function fi is a function on a subset of {xi, . . . , x„}, 
and takes on values in Xi. Dynamics is generated by iteration of /. 
As an example, if Xi = {0, 1}, then each fi is a Boolean rule and / is 
a Boolean network. 

Here, we use this very general framework to give a definition of 
the notion of nested canalyzing rule, which then applies to all differ- 
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ent model types simultaneously. We show through extensive simula- 
tions that dynamical systems constructed from such rules as coordinate 
functions have important dynamic properties characteristic of molec- 
ular networks, namely very short limit cycles and very few attractors, 
compared with the set of all possible functions. Furthermore, we show 
that many published models use logical interaction rules whose poly- 
nomial form is nested canalyzing, thereby providing evidence that gen- 
eral nested canalyzing rules represent a frequently occurring pattern 
in molecular network regulation. 

2 Nested Canalyzing Rules 

Here we present the general definition of a nested canalyzing rule in 
variables xi, . . . ,Xn with state space X =^ Xi x ■ ■ ■ x X„. 

Definition. Assume that each Xi is totally ordered, that is, its ele- 
ments can be arranged in linear increasing order. In the Boolean case 
this could be Xi = {0 < 1}. Let Si C Xi, i — 1, . . . ,n, he subsets that 
satisfy the property that each Si is a proper, nonempty subinterval of 
Xi, that is, every element of Xi that lies between two elements of Si 
in the chosen order is also in Si. Furthermore, we assume that the 
complement of each Si is also a subinterval, that is, each Si can be 
described by a threshold Si, with all elements of Si either larger or 
smaller than s^. 

• The function fi : X ^ Xi is a nested canalyzing rule in the vari- 
able order ^^^.(i) , . • . , ^^^(n) with canalyzing input sets Si, ... , 5„ C 
X and canalyzing output values 6i, . . . , 6„, &„+i G Xi with &„ ^ 
bn+i if it can be represented in the form: 



f{xi, ...,Xn)^< 



h if a;cr(i) e Si 

h if a;<T(i) ^ Si,x^^2) e S2 

h if x^n) i 5i,a;^(2) i Si,x„iy. £ S- 



bn if 2;<t(i) i Si,..., a;£r(„) e Sn 

hn+l if X^(i) ^ Si,..., Xc,(n) i- Sn 



• The function fi'.X-^Xiisa nested canalyzing function if it is a 
nested canalyzing function in some variable order x^r^i), . . . , Xcr{n) 
for some permutation a on {1, . . . , n}. 

It is straightforward to verify that, if Xi = {0, 1} for all i, then 
we recover the definition in [TS] of a Boolean nested canalyzing rule. 
As mentioned above, several important classes of multistate discrete 
models can be represented in the form of a dynamical system / : X — > 
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X, so that the concept of a nested canalyzing rule defined in this way 
has broad applicability. 

3 The dynamics of nested canalyzing net- 
works 

Aside from incorporating the biological concept of canalyzation, net- 
works whose nodes are controlled by combinatorial logic expressed by 
nested canalyzing rules have dynamic properties resembling those of 
biological networks. In particular, they are robust, due to the fact that 
they have a small number of attractors, which are therefore large. That 
is, perturbations are more likely to remain in the same attractor. In 
addition, limit cycles tend to be very short, compared to random net- 
works, which implies that these networks have very regular behavior. 
We have performed extensive simulation experiments for this purpose, 
whose results we report here. 

3.1 Computer experiments 

We have generated random network topologies with in-degree distri- 
bution k ranging between 2 and 5, i.e., each node depends on at least 
two inputs and at most on five inputs. This assumption is not unreal- 
istic and is based on the observation that gene regulation networks are 
sparse [T7] . For each network topology we have generated two discrete 
dynamical systems: one where the update rules are all nested canalyz- 
ing and the other where the update rules are randomly chosen (and 
non-degenerate, in the sense that all inputs indicated in the network 
topology are realized). 

3.1.1 Attractor distribution for nested canalyzing net- 
works versus random networks 

We present our results concerning attractor distributions in Figure [1] 
For each histogram, on the x-axis we represent the number of attrac- 
tors, and on the y-axis we represent the percentage of networks that 
show the given number of attractors specified on the x-axis. The pa- 
rameters n, k and p correspond to the number of nodes, the range for 
the in-degree distribution, and the number of states for each node re- 
spectively. For the experiments performed here we have generated the 
in-degree distribution from a uniform distribution, independently for 
each node and network realization. Figure [T] shows very clearly that 
the number of attractors in nested canalyzing networks is dramatically 
smaller than for general networks. Thus, attractor sizes are larger 
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on average than in general networks, leading to more robust behavior 
under perturbations. 

3.1.2 Cycle length distribution for nested canalyzing 
networks versus random networks 

We present the results concerning cycle lengths in Figures [2] - [5l For 
each figure, the upper subfigures show the mean number of attractors 
of length specified on the x-axis (solid lines) and their standard devia- 
tions (dashed lines), the 2:- axis of these subfigures are in a logarithmic 
scale. The bottom left subfigures shows the mean number of attractors 
of lengths specified on the x-axis in a log-log plot, and finally the bot- 
tom right subfigures shows the percentage of networks that returned a 
particular cycle of length specified on the x-axis. The parameters n, k 
and p correspond to the number of nodes, the range for the in-degree 
distribution, and the number of states for each node, respectively. For 
all of the experiments, we have generated the in-degree distribution 
from a uniform distribution, independently for each node and network 
realization. Figures [5] -[5] show clearly that networks with nested cana- 
lyzing rules exhibit significantly smaller cycle lengths, leading to more 
regular behavior. 

4 Nested canalyzing rules are biologically 
meaningful 

We hypothesize that nested canalyzing rules are biologically meaning- 
ful. To test this hypothesis we have explored a range of published 
multi-state models as to their frequency of appearance. Table [T] shows 
that they are indeed very prevalent, providing evidence that the nested 
canalyzation is indeed a common pattern for the regulatory logic in 
molecular interaction networks. To illustrate this phenomenon we dis- 
cuss specific examples. For a complete list of models we have studied 
see the supporting materials. 

4.1 Examples 

4.1.1 Lambda Phage Regulation 

Thieffry and Thomas [27 built a multi-state logical model for the core 
lambda phage regulatory network. This model encompasses the roles 
of the regulatory genes C7, CRO, CII, and N. See Figure [51 

The state space for this model is specified by [0, 2] x [0, 3] x [0, 1] x 
[0,1], that is, the first variable has three levels {0,1,2}, the second 
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variable has four levels {0,1,2,3}, and the third and fourth variables 
are still Boolean. 

The update rule for CI, fci, has inputs CRO and CII which is 
nested canalyzing in the variable order CII, CRO, with canalyzing 
input sets — {1}, and 5*2 = {1,2,3} and canalized output values 
2, 0, 2, i.e., (see the supporting materials for complete truth tables) 

r 2 if CII e Si 
fciiCRO, CII) = I OiiCII ^ Si, CRO e S2 
[ 2 if C// ^ Si, CRO i S-i. 

The update rule for CRO, fcRO, is nested canalyzing in the vari- 
able order CI, CRO, with canalyzing input sets S'l = {2}, and 5*2 = 
{0, 1, 2}, and canalized output values 0, 3, 2, i.e., 

r if C/ e S'l 
fcRoiCI, CRO) ^\ iiiCIi Si, CRO e S2 
[ 2iiCI i Si, CRO i 52. 

The update rule for CII, fciij is nested canalyzing in the variable 
order CII, CRO, N, with canalyzing input sets S'l — {2}, S2 — {3}, 
and S3 — {1}, and canalized output values 0, 0, 1, 0, i.e.. 



{0 if C/ e Si 
if C/ ^ Si, CRO e S'2 
1 if CIiSi,CRO<^S2,NeS3 
if C/ ^ Si,CRO iS2,N (/, S3. 

Finally, the update rule for N, Jn, is nested canalyzing in the 
variable order N, CRO, with canalyzing input sets Si = {1,2}, and 
S'2 = {2,3}, and canalized output values 0,0, 1, i.e., 

if C/ e Si 
fNiCI, CRO) ^ { if C/ ^ Si, CRO e S2 
lifCI i Si, CRO i S2. 

4.2 Regulation in the p53-Mdm2 network 

The following model comes from Abou-Jaude W., Ouattara A., Kauff- 
man M.(2009) 1 . The model represents the interactions of the tu- 
mor supressor protein p53 and its negative regulator Mdm2. Here, P, 
Mn, Mc, and Dam stand for protein p53, nuclear Mdm2, cytoplasmic 
Mdm2, and DNA damage, respectively. 

The state space for this model is specified by [0, 2] x [0, 1] x [0, 1] x 
[0,1], that is, except for the first variable P that has three levels 
{0, 1, 2}, all the other variables are still Boolean. 
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As shown in Figure [71 Mn acts negatively on P. The update rule 
of P, fp, is nested canalyzing with canalyzing input set Si = {0} and 
canalized output values Kp, Kp ^j^/[n}: i-G-j '^e can represent fp as, 



Here Kp is the basal value and ifp,{Mn} is the parameter value under 
the influence of Mn. 

Similarly, for Mc, its update rule, /a/c, is nested canalyzing with 
canalyzing input set 5*1 — {0,1} and canalized output values Kmc, Kmc,{p} , 
i.e., we can represent Jmc as. 



Here Kmc is the basal value and Ki\ic,{p} is the parameter value under 
the influence of P. 

For Mn, a set of possible parameters for its truth table is given 
in [T]. We have checked all these cases and found that for each case 
we either get a nested canalyzing function or a constant function. For 
example, for the second column of Figure 3 (a) in [1] we get that the 
update rule for Mn, fmn, is nested canalyzing in the variable order 
Mc, P with canalyzing input sets 5*1 — {!}, S2 = {1,2} and canalized 
output values 1,0, 1, i.e.. 



When DNA damage is introduced, it has a negative effect on Mn. 
From the set of all possible parameters for its truth table given in 
inequalities (3)-(5) at [T], we check that we can always find a nested 
canalyzing function for its truth table. For example, for the third 
column of Figure 3 (a) in [T] we get that the update rule for Mn 
(under DNA damage) is nested canalyzing in the variable order Ale, 
Dam, P, with canalyzing input sets Si = {1}, 5*2 = {1}, 6*3 = {1,2} 
and canalized output values 1, 0, 0, 1, i.e., we can represent Jmu as. 




fAIciP) - 



Kmc if P e 5i 
Kmc,{p} if P ^ 5*1. 



fMn{P,Mc) 



1 if Mc e Si 

if Mc (^Si,P e S2 

1 \i Mc i Si,P i S2. 



fMn{P,Mc,Dam) 



1 if Mc G Si 

Oif Mc^ Si,Dam e S2 

if Mc ^ Si, Dam ^ 5*2, P G 5*3 

1 if Mc ^ Si, Dam ^ 5*2, P ^ S'3- 



Finally, the update rule for DNA damage, foarm is nested canalyz- 
ing in the variable order P, Dam, with canalyzing input sets Si = {2}, 
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S2 — {1} and canalized output values 0, 1, 0, i.e., 

r 1 if P e S*! 

foamiP, Dam) = \ WiPi Si, Dam £ ^2 
[ if P ^ Si, Dam (j, 82. 

5 Discussion 

In this paper we have given a definition of a nested canalyzing rule, in- 
spired by the special case of Boolean networks, and we have shown that 
it appears as a frequent pattern for the regulatory logic of many molec- 
ular interaction networks. We have shown that this regulatory pattern 
leads to networks that have robust and regular dynamics, as a result 
of having very small numbers of attractors and very short limit cycles, 
compared to random networks. This behavior is also characteristic of 
many molecular interaction networks. An important application of this 
result is to the construction of discrete models, both via a bottom-up 
or a top-down approach. For both approaches, the possibility of re- 
stricting the choice of rules to the family of nested canalyzing rules is 
a significant reduction in the possible model space that is available. 

Another interesting aspect of our results is suggested by |T3] . There 
is was shown that in the Boolean case, the class of nested canalyzing 
Boolean rules is in fact identical to the class of unate cascade functions. 
These functions have been studied extensively in computer engineer- 
ing and have been shown [3] to have the important property that they 
comprise exactly the class of Boolean functions that lead to binary 
decision diagrams with shortest average path length. Thus, they make 
good candidates for the representation of efhcient information process- 
ing. It would be interesting to study this property for general nested 
canalyzing rules. 

The results in [13| were derived by using a special representation of 
Boolean functions, namely as polynomial functions over the Boolean 
number field, with arithmetic given by addition and multiplication 
"tables," with the key rule that 1 + 1 = 0. Using a parametrization of 
the family of all nested canalyzing Boolean polynomials, it was shown 
in |13| that the family of all nested canalyzing polynomials in a given 
number of variables is in fact identical to the class of unate cascade 
functions, which, in turn, is equal to the class of Boolean functions 
that result in binary decision diagrams of shortest average path length 
[3]. In a paper in preparation we have shown that one can give a 
similar parameterization of the variety of general nested canalyzing 
rules, and we use this parameterization to derive a formula for the 
number of nested canalyzing rules for a given number of variables. It 
would be interesting to investigate whether this general class of nested 
canalyzing rules leads to n-ary decision diagrams that have similar 
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Model 


References 


n ^ 




lysis-lysogeny decision in the lambda 


m 


4 


100% 


phage 








p53-Mdm2 regulation 


[ij 


4 


100% 


Signalling pathways controlling Th cell 


m 


42 


92.8% 


differentiation 








Budding yeast exit module 


MM 


9 


77.7% 


Dorsal-ventral boundary formation of the 


m 


24 


75% 


Drosophila wing imaginal disc 








Control of Thl /Th2 cell differentiation 


[19, 4] 


14 


71.4% 


Yeast morphogenetic checkpoint 




8 


50% 



Table 1: Nested canalyzing functions for multi-state models 

^ Number of nodes. Only nodes with in-degree ^ 1 are considered, i.e. 

non-constant nodes. 
^ Percentage of nodes regulated by nested canalyzing functions. 
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n=10, ke [2-5], and p=2 n=20, ke [2-5], and p=2 




Number of attractors Number of attractors 



n=10, ke [2-5], and p=3 n=5, ke [2-5], and p=5 




Number of attractors Number of attractors 



Figure 1: Attractor distribution for networks with nested canalyzing func- 
tions (solid circles) and networks with random functions (open circles). The 
figures show the percentage of networks that returned the number of attrac- 
tors specified in the x-axis. The parameters n, k and p correspond to the 
number of nodes, the range for the in-degree distribution, and the number 
of states for each node, respectively. The figures for n = 5, 10 and p = 2,5 
were generated for 1000000 networks, the figure for n = 10 and p = 3 was 
generated for 100000 networks, and the figure for n = 20 and p = 2 was 
generated for 10000 networks. 
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n=1 0, ke [2-5], and p=2 n=1 0, ke [2-5], and p=2 




1 2 3 20 87 1 2 3 20 87 



Cycle length Cycle length 



Figure 2: Cycle lengtli for networlcs with nested canalyzing functions (solid 
circles) and networks with random functions (open circles). The parameters 
n, k and p correspond to the number of nodes, the range for the in-degree 
distribution, and the number of states for each node, respectively. The 
figures were generated for one million networks. The upper figures show 
the mean number of attractors of length specified in the x-axis (solid lines) 
and their standard deviations (dashed lines), the x-axis of these figures are 
in a logarithmic scale. The bottom left figure shows the mean number 
of attractors of lengths specified in the x-axis in a log-log plot (here rnd 
means random), and finally the bottom right figure shows the percentage of 
networks that returned a particular cycle of length specified on the x-axis. 
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Figure 3: Cycle length for networks with nested canalyzing functions (solid 
circles) and networks with random functions (open circles). The parameters 
n, k and p correspond to the number of nodes, the range for the in-degree 
distribution, and the number of states for each node, respectively. The 
figures were generated for 10000 networks. The upper figures show the mean 
number of attractors of length specified on the x-axis (solid lines) and their 
standard deviations (dashed lines), the x-axis of these figures is a logarithmic 
scale. The bottom left figure shows the mean number of attractors of lengths 
specified on the x-axis in a log-log plot (here rnd means random), and finally 
the bottom right figure shows the percentage of networks that returned a 
particular cycle of length specified on the x-axis. 
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n=1 0, ke [2-5], and p=3 n=1 0, ke [2-5], and p=3 




1 2 3 26 57 1 2 3 26 57 



Cycle length Cycle length 



Figure 4: Cycle lengtli for networlcs with nested canalyzing functions (solid 
circles) and networks with random functions (open circles). The parameters 
n, k and p correspond to the number of nodes, the range for the in-degree 
distribution, and the number of states for each node, respectively. The 
figures were generated for 100000 networks. The upper figures show the 
mean number of attractors of length specified on the x-axis (solid lines) 
and their standard deviations (dashed lines), the x-axis of these figures are 
in a logarithmic scale. The bottom left figure shows the mean number 
of attractors of lengths specified on the x-axis in a log-log plot (here rnd 
means random), and finally the bottom right figure shows the percentage of 
networks that returned a particular cycle of length specified on the x-axis. 
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Figure 5: Cycle length for networks with nested canalyzing functions (solid 
circles) and networks with random functions (open circles). The parameters 
n, k and p correspond to the number of nodes, the range for the in-degree 
distribution, and the number of states for each node, respectively. The 
figures were generated for one million networks. The upper figures show 
the mean number of attractors of length specified on the x-axis (solid lines) 
and their standard deviations (dashed lines), the x-axis of these figures are 
in a logarithmic scale. The bottom left figure shows the mean number 
of attractors of lengths specified on the x-axis in a log-log plot (here rnd 
means random), and finally the bottom right figure shows the percentage of 
networks that returned a particular cycle of length specified on the x-axis. 
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Figure 6: Four-variable model for the lambda phage regulatory network. 




Figure 7: Four-variable model for the p53-Mdm2 regulatory network. 
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